SEQUENCE LISTING



(1) GENERAL INFORMATION:



(i) APPLICANT: 

(A) NAME: BASF Aktiengesellschaft

(B) STREET: Carl-Bosch-Strasse 38 

(C) CITY: Ludwigshafen 

(D) STATE: Rhineland-Palatinate

(E) COUNTRY: Federal Republic of Germany

(F) POSTAL CODE: D-67056 

(ii) TITLE OF APPLICATION: A process for preparing chiral carboxylic 
acids from nitriles using a nitrilase or microorganisms which comprise a gene 
for the nitrilase 

(iii) NUMBER OF SEQUENCES: 9 

(iv) COMPUTER-READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)
(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1071 Base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Double

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) [sic] ANTISENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Alcaligenes faecalis 

(B) STRAIN: 1650 

(ix) FEATURES: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1071 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

ATG CAG ACA AGA AAA ATC GTC CGG GCA GCC GCC GTA CAG GCC GCC TCT 48
Met Gln Thr Arg Lys Ile Val Arg Ala Ala Ala Val Gln Ala Ala Ser
1 5 10 15

CCC AAC TAC GAT CTG GCA ACG GGT GTT GAT AAA ACC ATT GAG CTG GCT 96
Pro Asn Tyr Asp Leu Ala Thr Gly Val Asp Lys Thr Ile Glu Leu Ala
20 25 30

CGT CAG GCC CGC GAT GAG GGC TGT GAC CTG ATC GTG TTT GGT GAA ACC 144
Arg Gln Ala Arg Asp Glu Gly Cys Asp Leu Ile Val Phe Gly Glu Thr
35 40 45

TGG CTG CCC GGA TAT CCC TTC CAC GTC TGG CTG GGC GCA CCG GCC TGG 192
Trp Leu Pro Gly Tyr Pro Phe His Val Trp Leu Gly Ala Pro Ala Trp
50 55 60

TCG CTG AAA TAC AGT GCC CGC TAC TAT GCC AAC TCG CTC TCG CTG GAC 240
Ser Leu Lys Tyr Ser Ala Arg Tyr Tyr Ala Asn Ser Leu Ser Leu Asp
65 70 75 80

AGT GCA GAG TTT CAA CGC ATT GCC CAG GCC GCA CGG ACC TTG GGT ATT 288
Ser Ala Glu Phe Gln Arg Ile Ala Gln Ala Ala Arg Thr Leu Gly Ile
85 90 95

TTC ATC GCA CTG GGT TAT AGC GAG CGC AGC GGC GGC AGC CTT TAC CTG 336
Phe Ile Ala Leu Gly Tyr Ser Glu Arg Ser Gly Gly Ser Leu Tyr Leu
100 105 110

GGC CAA TGC CTG ATC GAC GAC AAG GGC GAG ATG CTG TGG TCG CGT CGC 384
Gly Gln Cys Leu Ile Asp Asp Lys Gly Glu Met Leu Trp Ser Arg Arg
115 120 125

AAA CTC AAA CCC ACG CAT GTA GAG CGC ACC GTA TTT GGT GAA GGT TAT 432
Lys Leu Lys Pro Thr His Val Glu Arg Thr Val Phe Gly Glu Gly Tyr
130 135 140

GCC CGT GAT CTG ATT GTG TCC GAC ACA GAA CTG GGA CGC GTC GGT GCT 480
Ala Arg Asp Leu Ile Val Ser Asp Thr Glu Leu Gly Arg Val Gly Ala
145 150 155 160

CTA TGC TGC TGG GAG CAT TTG TCG CCC TTG AGC AAG TAC GCG CTG TAC 528
Leu Cys Cys Trp Glu His Leu Ser Pro Leu Ser Lys Tyr Ala Leu Tyr
165 170 175

TCC CAG CAT GAA GCC ATT CAC ATT GCT GCC TGG CCG TCG TTT TCG CTA 576
Ser Gln His Glu Ala Ile His Ile Ala Ala Trp Pro Ser Phe Ser Leu
180 185 190

TAC AGC GAA CAG GCC CAC GCC CTC AGT GCC AAG GTG AAC ATG GCT GCC 624
Tyr Ser Glu Gln Ala His Ala Leu Ser Ala Lys Val Asn Met Ala Ala
195 200 205

TCG CAA ATC TAT TCG GTT GAA GGC CAG TGC TTT ACC ATC GCC GCC AGC 672
Ser Gln Ile Tyr Ser Val Glu Gly Gln Cys Phe Thr Ile Ala Ala Ser
210 215 220





0050/49462 




AGT GTG GTC ACC CAA GAG ACG CTA GAC ATG CTG GAA GTG GGT GAA CAC 720
Ser Val Val Thr Gln Glu Thr Leu Asp Met Leu Glu Val Gly Glu His
225 230 235 240

AAC GCC CCC TTG CTG AAA GTG GGC GGC GGC AGT TCC ATG ATT TTT GCG 768
Asn Ala Pro Leu Leu Lys Val Gly Gly Gly Ser Ser Met Ile Phe Ala
245 250 255

CCG GAC GGA CGC ACA CTG GCT CCC TAC CTG CCT CAC GAT GCC GAG GGC 816
Pro Asp Gly Arg Thr Leu Ala Pro Tyr Leu Pro His Asp Ala Glu Gly
260 265 270

TTG ATC ATT GCC GAT CTG AAT ATG GAG GAG ATT GCC TTC GCC AAA GCG 864
Leu Ile Ile Ala Asp Leu Asn Met Glu Glu Ile Ala Phe Ala Lys Ala
275 280 285

ATC AAT GAC CCC GTA GGC CAC TAT TCC AAA CCC GAG GCC ACC CGT CTG 912
Ile Asn Asp Pro Val Gly His Tyr Ser Lys Pro Glu Ala Thr Arg Leu
290 295 300

GTG CTG GAC TTG GGG CAC CGA GAC CCC ATG ACT CGG GTG CAC TCC AAA 960
Val Leu Asp Leu Gly His Arg Asp Pro Met Thr Arg Val His Ser Lys
305 310 315 320

AGC GTG ACC AGG GAA GAG GCT CCC GAG CAA GGT GTG CAA AGC AAG ATT 1008
Ser Val Thr Arg Glu Glu Ala Pro Glu Gln Gly Val Gln Ser Lys Ile
325 330 335

GCC TCA GTC GCT ATC AGC CAT CCA CAG GAC TCG GAC ACA CTG CTA GTG 1056
Ala Ser Val Ala Ile Ser His Pro Gln Asp Ser Asp Thr Leu Leu Val
340 345 350

CAA GAG CCG TCT TGA 1071
Gln Glu Pro Ser
355
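
The pairing of codons and residues in SEQ ID NO: 1 can be spot-checked with a short script. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE`, `translate`, `FIRST_ROW` and `LAST_ROW` are illustrative and not part of the listing, and only the first and last rows of the sequence are reproduced.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First and last rows of SEQ ID NO: 1, copied from the listing.
FIRST_ROW = "ATG CAG ACA AGA AAA ATC GTC CGG GCA GCC GCC GTA CAG GCC GCC TCT"
LAST_ROW = "CAA GAG CCG TCT TGA"

print(translate(FIRST_ROW))  # residues 1 to 16
print(translate(LAST_ROW))   # residues 353 to 356, then the stop codon '*'

# 1071 bases are 357 codons; dropping the stop codon leaves 356 residues,
# which agrees with the length stated for SEQ ID NO: 2.
print(1071 // 3 - 1)
```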

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 356 Amino acids 

(B) TYPE: Amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Gln Thr Arg Lys Ile Val Arg Ala Ala Ala Val Gln Ala Ala Ser
1 5 10 15

Pro Asn Tyr Asp Leu Ala Thr Gly Val Asp Lys Thr Ile Glu Leu Ala
20 25 30

Arg Gln Ala Arg Asp Glu Gly Cys Asp Leu Ile Val Phe Gly Glu Thr
35 40 45

Trp Leu Pro Gly Tyr Pro Phe His Val Trp Leu Gly Ala Pro Ala Trp
50 55 60

Ser Leu Lys Tyr Ser Ala Arg Tyr Tyr Ala Asn Ser Leu Ser Leu Asp
65 70 75 80

Ser Ala Glu Phe Gln Arg Ile Ala Gln Ala Ala Arg Thr Leu Gly Ile
85 90 95

Phe Ile Ala Leu Gly Tyr Ser Glu Arg Ser Gly Gly Ser Leu Tyr Leu
100 105 110

Gly Gln Cys Leu Ile Asp Asp Lys Gly Glu Met Leu Trp Ser Arg Arg
115 120 125

Lys Leu Lys Pro Thr His Val Glu Arg Thr Val Phe Gly Glu Gly Tyr
130 135 140

Ala Arg Asp Leu Ile Val Ser Asp Thr Glu Leu Gly Arg Val Gly Ala
145 150 155 160

Leu Cys Cys Trp Glu His Leu Ser Pro Leu Ser Lys Tyr Ala Leu Tyr
165 170 175

Ser Gln His Glu Ala Ile His Ile Ala Ala Trp Pro Ser Phe Ser Leu
180 185 190

Tyr Ser Glu Gln Ala His Ala Leu Ser Ala Lys Val Asn Met Ala Ala
195 200 205

Ser Gln Ile Tyr Ser Val Glu Gly Gln Cys Phe Thr Ile Ala Ala Ser
210 215 220

Ser Val Val Thr Gln Glu Thr Leu Asp Met Leu Glu Val Gly Glu His
225 230 235 240

Asn Ala Pro Leu Leu Lys Val Gly Gly Gly Ser Ser Met Ile Phe Ala
245 250 255

Pro Asp Gly Arg Thr Leu Ala Pro Tyr Leu Pro His Asp Ala Glu Gly
260 265 270

Leu Ile Ile Ala Asp Leu Asn Met Glu Glu Ile Ala Phe Ala Lys Ala
275 280 285

Ile Asn Asp Pro Val Gly His Tyr Ser Lys Pro Glu Ala Thr Arg Leu
290 295 300

Val Leu Asp Leu Gly His Arg Asp Pro Met Thr Arg Val His Ser Lys
305 310 315 320

Ser Val Thr Arg Glu Glu Ala Pro Glu Gln Gly Val Gln Ser Lys Ile
325 330 335

Ala Ser Val Ala Ile Ser His Pro Gln Asp Ser Asp Thr Leu Leu Val
340 345 350

Gln Glu Pro Ser
355

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 Amino acids 

(B) TYPE: Amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Peptide 
(iii) HYPOTHETICAL: NO

(iii) [sic] ANTISENSE: NO 

(v) FRAGMENT TYPE: N Terminus 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Alcaligenes faecalis 

(B) STRAIN: 1650 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Nitrilase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Gln Thr Arg Lys Ile Val Arg Ala Ala Ala Val Gln Ala Ala Ser
1 5 10 15

Pro Asn Tyr Asp Leu Ala Thr Gly Val Asp Lys Thr Ile Glu Leu Ala
20 25 30

Arg Gln Ala Arg Asp Glu Gly
35

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 Amino acids 

(B) TYPE: Amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: Peptide 

(iii) HYPOTHETICAL: NO

(iii) [sic] ANTISENSE: NO

(v) FRAGMENT TYPE: Internal

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Alcaligenes faecalis 

(B) STRAIN: 1650 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Nitrilase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Glu Glu Ala Pro Glu Gln Gly Val Gln Ser Lys Ile Ala Ser Val Ala
1 5 10 15

Ile Ser His Pro Gln
20

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 Amino acids

(B) TYPE: Amino acid
(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: Peptide

(iii) HYPOTHETICAL: NO

(iii) [sic] ANTISENSE: NO

(v) FRAGMENT TYPE: Internal

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Alcaligenes faecalis 

(B) STRAIN: 1650 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Nitrilase

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Glu Glu Ala Pro Glu Gln Gly Val Gln Ser Lys
1 5 10
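
The internal peptides of SEQ ID NO: 4 and SEQ ID NO: 5 can be located within the full-length protein of SEQ ID NO: 2. The following is a minimal sketch of that lookup; the helper names (`to_one_letter`, `locate`) are illustrative, and only the two rows of SEQ ID NO: 2 covering residues 321 to 352 are reproduced.

```python
# Three-letter to one-letter amino acid codes (the 20 standard residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(seq: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[res] for res in seq.split())


def locate(peptide: str, region: str, region_start: int):
    """Return (first, last) residue numbers of peptide in region, or None."""
    idx = to_one_letter(region).find(to_one_letter(peptide))
    if idx < 0:
        return None
    first = region_start + idx
    return first, first + len(peptide.split()) - 1


SEQ4 = ("Glu Glu Ala Pro Glu Gln Gly Val Gln Ser Lys Ile Ala Ser Val Ala "
        "Ile Ser His Pro Gln")
SEQ5 = "Glu Glu Ala Pro Glu Gln Gly Val Gln Ser Lys"

# Residues 321 to 352 of SEQ ID NO: 2, copied from the listing.
SEQ2_321_352 = ("Ser Val Thr Arg Glu Glu Ala Pro Glu Gln Gly Val Gln Ser Lys Ile "
                "Ala Ser Val Ala Ile Ser His Pro Gln Asp Ser Asp Thr Leu Leu Val")

print(locate(SEQ4, SEQ2_321_352, 321))
print(locate(SEQ5, SEQ2_321_352, 321))
```

Under these inputs SEQ ID NO: 5 corresponds to the first 11 residues of SEQ ID NO: 4, and both peptides start at the same residue of SEQ ID NO: 2.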

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 Base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single

(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 

(iii) [sic] ANTISENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Alcaligenes faecalis 

(B) STRAIN: 1650 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Nitrilase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

ATGCAGACNA GNAARATCGT SCG
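
SEQ ID NO: 6 uses IUPAC ambiguity codes (N, R, S), so it stands for a pool of oligonucleotides rather than a single one. The following is a minimal sketch that counts the pool size and checks the primer against the first 23 bases of SEQ ID NO: 1; the function names are illustrative and not taken from the listing.

```python
# IUPAC nucleotide codes mapped to the bases each one may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def matches(primer: str, target: str) -> bool:
    """True if every target base is allowed by the primer code at that position."""
    return len(primer) == len(target) and all(
        t in IUPAC[p] for p, t in zip(primer, target)
    )


SEQ6 = "ATGCAGACNAGNAARATCGTSCG"
# First 23 bases of SEQ ID NO: 1, copied from the listing.
CDS_START = "ATGCAGACAAGAAAAATCGTCCG"

print(len(SEQ6))                 # stated length: 23
print(degeneracy(SEQ6))          # two N (4 each), one R (2), one S (2)
print(matches(SEQ6, CDS_START))
```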

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 Base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iii) [sic] ANTISENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Alcaligenes faecalis 

(B) STRAIN: 1650 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Nitrilase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
TNGCSACNGA NGCRATCTTG 
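
SEQ ID NO: 7 reads in the opposite direction to the coding strand of SEQ ID NO: 1, so comparing the two requires a reverse complement that also handles the ambiguity codes. The following is a minimal sketch under that reading; `CDS_SEGMENT` is the 20-base stretch of SEQ ID NO: 1 spanning the rows that end at 1008 and 1056, and the helper names are illustrative.

```python
# IUPAC nucleotide codes mapped to the bases each one may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each IUPAC code (for example R = A/G pairs with Y = T/C).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence that may contain IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def matches(primer: str, target: str) -> bool:
    """True if every target base is allowed by the primer code at that position."""
    return len(primer) == len(target) and all(
        t in IUPAC[p] for p, t in zip(primer, target)
    )


SEQ7 = "TNGCSACNGANGCRATCTTG"
# Coding-strand bases C AAG ATT GCC TCA GTC GCT A from SEQ ID NO: 1.
CDS_SEGMENT = "CAAGATTGCCTCAGTCGCTA"

rc = reverse_complement(SEQ7)
print(rc)
print(matches(rc, CDS_SEGMENT))
```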
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 Base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iii) [sic] ANTISENSE: NO 




(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Alcaligenes faecalis 

(B) STRAIN: 1650 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Nitrilase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TTAATCATAT GCAGACAAGA AAAATCGTCC G 31 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 Base pairs 

(B) TYPE: Nucleic acid 

(C) STRANDEDNESS: Single 

(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) [sic] ANTISENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Alcaligenes faecalis 

(B) STRAIN: 1650 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Nitrilase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



AAGGATCCTC AAGACGGCTC TTGCACTAGC AG 
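
SEQ ID NO: 8 and SEQ ID NO: 9 each consist of a short 5' extension followed by bases that match the ends of SEQ ID NO: 1. The following is a minimal sketch that finds the matching portion of each primer and looks for the hexamers CATATG and GGATCC, which are the recognition sequences of NdeI and BamHI. The listing itself does not state how the primers were used, so the enzyme names are an interpretation, and the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def annealing_suffix(primer: str, template: str) -> str:
    """Longest 3' portion of the primer that occurs verbatim in the template."""
    for start in range(len(primer)):
        if primer[start:] in template:
            return primer[start:]
    return ""


SEQ8 = "TTAATCATATGCAGACAAGAAAAATCGTCCG"
SEQ9 = "AAGGATCCTCAAGACGGCTCTTGCACTAGCAG"

# First and last 30 bases of SEQ ID NO: 1, copied from the listing.
CDS_5PRIME = "ATGCAGACAAGAAAAATCGTCCGGGCAGCC"
CDS_3PRIME = "GACACACTGCTAGTGCAAGAGCCGTCTTGA"

fwd = annealing_suffix(SEQ8, CDS_5PRIME)
rev = annealing_suffix(SEQ9, reverse_complement(CDS_3PRIME))

print(fwd, len(fwd))             # portion of SEQ ID NO: 8 matching the 5' end
print(rev, len(rev))             # portion of SEQ ID NO: 9 matching the 3' end
print(SEQ8.find("CATATG"))       # index of the NdeI recognition sequence
print(SEQ9.find("GGATCC"))       # index of the BamHI recognition sequence
```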